
    fMRI Investigation of Cortical and Subcortical Networks in the Learning of Abstract and Effector-Specific Representations of Motor Sequences

    A visuomotor sequence can be learned as a series of visuo-spatial cues or as a sequence of effector movements. Earlier imaging studies have revealed that a network of brain areas is activated in the course of motor sequence learning. However, these studies do not address the question of the type of representation being established at various stages of visuomotor sequence learning. In an earlier behavioral study, we demonstrated that acquisition of a visuo-spatial sequence representation enables rapid learning in the early stage, while progressive establishment of a somato-motor representation supports speedier execution by the late stage. We conducted functional magnetic resonance imaging (fMRI) experiments in which subjects learned and practiced the same sequence alternately in normal and rotated settings. In one rotated setting (visual), subjects learned a new motor sequence in response to a sequence of visual cues identical to that of the normal setting. In the other rotated setting (motor), the display sequence was altered compared to the normal setting, but the same sequence of effector movements was used to perform the sequence. Comparison of the different rotated settings revealed analogous transitions in both cortical and subcortical sites during visuomotor sequence learning: a transition of activity from parietal to parietal-premotor and then to premotor cortex, and a concomitant shift from anterior putamen to combined activity in both anterior and posterior putamen and finally to posterior putamen. These results suggest that engagement of different cortical and subcortical networks at various stages of learning supports distinct sequence representations.

    PIPPS: Flexible model-based policy search robust to the curse of chaos

    Previously, the exploding gradient problem has been considered central in deep learning and model-based reinforcement learning because it causes numerical issues and instability in optimization. Our experiments in model-based reinforcement learning imply that the problem is not just a numerical issue; it may be caused by a fundamentally chaos-like nature of long chains of nonlinear computations. Not only do the magnitudes of the gradients become large, but their direction becomes essentially random. We show that reparameterization gradients suffer from this problem, while likelihood-ratio gradients are robust. Using these insights, we develop a model-based policy search framework, Probabilistic Inference for Particle-Based Policy Search (PIPPS), which is easily extensible and allows for almost arbitrary models and policies, while simultaneously matching the performance of previous data-efficient learning algorithms. Finally, we introduce the total propagation algorithm, which efficiently computes a union over all pathwise derivative depths during a single backward pass, automatically giving greater weight to estimators with lower variance, sometimes improving over reparameterization gradients by a factor of 10^6.
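    The contrast the abstract draws between the two gradient families can be illustrated on a one-dimensional toy problem (not the paper's setup): estimating the gradient of E[f(x)] with x ~ N(mu, sigma^2), once via the reparameterization (pathwise) trick and once via the likelihood-ratio (score-function) estimator with a variance-reducing baseline. The test function f is an arbitrary nonlinear stand-in for a long rollout.

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x):
    # Arbitrary nonlinear function standing in for a long chain of computations.
    return np.sin(3.0 * x) * x**2

mu, sigma, n = 0.5, 0.2, 100_000
eps = rng.standard_normal(n)
x = mu + sigma * eps                     # reparameterized samples

# Reparameterization (pathwise) estimator: differentiate through x = mu + sigma*eps,
# so dE[f]/dmu is estimated by the mean of f'(x).
df_dx = 3.0 * np.cos(3.0 * x) * x**2 + 2.0 * x * np.sin(3.0 * x)
grad_rp = df_dx.mean()

# Likelihood-ratio (score-function) estimator with a mean baseline:
# dE[f]/dmu = E[f(x) * d log N(x; mu, sigma) / dmu].
score = (x - mu) / sigma**2
fx = f(x)
grad_lr = (score * (fx - fx.mean())).mean()

print(grad_rp, grad_lr)  # the two estimates of the same gradient
```

On this smooth, short computation the two estimators agree; the paper's point is that for long chaotic rollouts the pathwise estimate degrades while the likelihood-ratio estimate stays usable.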

    Chaotic exploration and learning of locomotion behaviours

    We present a general and fully dynamic neural system, which exploits intrinsic chaotic dynamics, for the real-time goal-directed exploration and learning of the possible locomotion patterns of an articulated robot of arbitrary morphology in an unknown environment. The controller is modeled as a network of neural oscillators that are initially coupled only through physical embodiment, and goal-directed exploration of coordinated motor patterns is achieved by chaotic search using adaptive bifurcation. The phase space of the indirectly coupled neural-body-environment system contains multiple transient or permanent self-organized dynamics, each of which is a candidate for a locomotion behavior. The adaptive bifurcation enables the system orbit to wander through various phase-coordinated states, using its intrinsic chaotic dynamics as a driving force, and stabilizes onto one of the states matching the given goal criteria. In order to improve the sustainability of useful transient patterns, sensory homeostasis has been introduced, which results in an increased diversity of motor outputs, thus achieving multiscale exploration. A rhythmic pattern discovered by this process is memorized and sustained by changing the wiring between initially disconnected oscillators using an adaptive synchronization method. Our results show that the novel neurorobotic system is able to create and learn multiple locomotion behaviors for a wide range of body configurations and physical environments and can readapt in real time after sustaining damage.
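    The core idea of chaotic search — use a deterministic chaotic orbit as the exploration driver and stop ("stabilize") once a goal criterion is met — can be sketched with a single logistic map in place of the paper's coupled neural oscillators. This is a minimal toy, not the paper's controller; the map parameter, initial state, and goal band are all illustrative.

```python
def chaotic_search(goal_fn, r=3.99, x0=0.123, max_iter=100_000):
    """Use the chaotic orbit of a logistic map as a search driver:
    each iterate proposes a candidate state, and the search stops on
    the first candidate satisfying the goal criterion."""
    x = x0
    for i in range(max_iter):
        x = r * x * (1.0 - x)   # chaotic update (r near 4 => chaotic regime)
        if goal_fn(x):
            return x, i         # "stabilize" on the discovered state
    return None, max_iter

# Goal: land in a narrow target band; the dense chaotic orbit finds it
# without any gradient information or random number generator.
x, steps = chaotic_search(lambda v: 0.25 <= v <= 0.26)
```

In the paper the analogous mechanism is richer: the bifurcation parameter is adapted online, so the system itself moves between chaotic wandering and stable phase-coordinated motion rather than halting at a hit.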

    Probabilistic Inference for Fast Learning in Control

    We provide a novel framework for very fast model-based reinforcement learning in continuous state and action spaces. The framework requires probabilistic models that explicitly characterize their levels of confidence. Within this framework, we use flexible, non-parametric models to describe the world based on previously collected experience. We demonstrate learning on the cart-pole problem in a setting where we provide very limited prior knowledge about the task. Learning progresses rapidly, and a good policy is found after only a handful of iterations.
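    The "probabilistic models that explicitly characterize their levels of confidence" are, in this line of work, typically Gaussian process regressors. A minimal NumPy sketch (illustrative hyperparameters, not the paper's model) shows the key property: predictive variance is small near observed transitions and reverts to the prior far from the data.

```python
import numpy as np

def gp_predict(X, y, Xs, ell=1.0, sf=1.0, noise=1e-2):
    """GP regression with an RBF kernel: returns the predictive mean and
    variance at test inputs Xs given 1-D training data (X, y)."""
    def k(A, B):
        d = A[:, None] - B[None, :]
        return sf**2 * np.exp(-0.5 * (d / ell) ** 2)

    K = k(X, X) + noise * np.eye(len(X))   # noisy training covariance
    Ks, Kss = k(Xs, X), k(Xs, Xs)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    mu = Ks @ alpha                        # predictive mean
    v = np.linalg.solve(L, Ks.T)
    var = np.diag(Kss) - np.sum(v**2, axis=0)  # predictive variance
    return mu, var

X = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
y = np.sin(X)                              # toy "collected experience"
mu, var = gp_predict(X, y, np.array([0.5, 5.0]))
# var is small at 0.5 (inside the data) and near the prior sf**2 at 5.0.
```

It is exactly this calibrated uncertainty that lets the framework avoid trusting model extrapolations in unexplored regions of the state space.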

    RGMa inhibition promotes axonal growth and recovery after spinal cord injury

    Repulsive guidance molecule (RGM) is a protein implicated in both axonal guidance and neural tube closure. We report RGMa as a potent inhibitor of axon regeneration in the adult central nervous system (CNS). RGMa inhibits mammalian CNS neurite outgrowth by a mechanism dependent on the activation of the RhoA–Rho kinase pathway. RGMa expression is observed in oligodendrocytes, myelinated fibers, and neurons of the adult rat spinal cord and is induced around the injury site after spinal cord injury. We developed an antibody to RGMa that efficiently blocks the effect of RGMa in vitro. Intrathecal administration of the antibody to rats with thoracic spinal cord hemisection results in significant axonal growth of the corticospinal tract and improves functional recovery. Thus, RGMa plays an important role in limiting axonal regeneration after CNS injury, and the RGMa antibody offers a possible therapeutic agent in clinical conditions characterized by a failure of CNS regeneration.

    Anything You Can Do, You Can Do Better: Neural Substrates of Incentive-Based Performance Enhancement

    Performance-based pay schemes in many organizations share the fundamental assumption that the performance level for a given task will increase as a function of the amount of incentive provided. Consistent with this notion, psychological studies have demonstrated that expectations of reward can improve performance on a plethora of different cognitive and physical tasks, ranging from problem solving to the voluntary regulation of heart rate. However, much less is understood about the neural mechanisms of incentivized performance enhancement. In particular, it remains an open question how brain areas that encode expectations about reward are able to translate incentives into improved performance across fundamentally different cognitive and physical task requirements.

    Stable Propagation of a Burst Through a One-Dimensional Homogeneous Excitatory Chain Model of Songbird Nucleus HVC

    We demonstrate numerically that a brief burst consisting of two to six spikes can propagate in a stable manner through a one-dimensional homogeneous feedforward chain of non-bursting neurons with excitatory synaptic connections. Our results are obtained for two kinds of neuronal models: leaky integrate-and-fire (LIF) neurons and Hodgkin-Huxley (HH) neurons with five conductances. Over a range of parameters such as the maximum synaptic conductance, both kinds of chains are found to have multiple attractors of propagating bursts, with each attractor being distinguished by the number of spikes and total duration of the propagating burst. These results make plausible the hypothesis that the sparse, precisely timed sequential bursts observed in projection neurons of nucleus HVC of a singing zebra finch are intrinsic and causally related.
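    A stripped-down version of the LIF setting — one neuron per layer, instantaneous suprathreshold synaptic kicks, and a fixed axonal delay, with illustrative parameters rather than those of the paper — already shows stable feedforward propagation of activity down the chain:

```python
import numpy as np

def lif_chain(n=10, T=200.0, dt=0.01):
    """Minimal LIF feedforward chain: a kick to neuron 0 at t = 1 ms
    triggers a spike that propagates layer by layer via excitatory kicks."""
    tau, v_th, v_reset = 10.0, 1.0, 0.0   # membrane time constant, threshold, reset
    w, delay = 1.5, 1.0                   # suprathreshold kick, synaptic delay (ms)
    v = np.zeros(n)
    spike_time = np.full(n, np.nan)
    kicks = {int(1.0 / dt): [0]}          # step -> list of neurons receiving a kick
    for s in range(int(T / dt)):
        v += dt * (-v / tau)              # leaky integration
        for tgt in kicks.pop(s, []):
            v[tgt] += w                   # instantaneous excitatory kick
        for i in np.where((v >= v_th) & np.isnan(spike_time))[0]:
            spike_time[i] = s * dt        # record first spike, reset, forward the kick
            v[i] = v_reset
            if i + 1 < n:
                kicks.setdefault(s + int(delay / dt), []).append(i + 1)
    return spike_time

times = lif_chain()  # one spike per layer, latencies increasing down the chain
```

The paper's contribution goes beyond this sketch: with finite synaptic conductances and multi-spike bursts, the chain supports several coexisting attractors, each fixed by the burst's spike count and duration.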

    Risk, Unexpected Uncertainty, and Estimation Uncertainty: Bayesian Learning in Unstable Settings

    Recently, evidence has emerged that humans approach learning using Bayesian updating rather than (model-free) reinforcement algorithms in a six-arm restless bandit problem. Here, we investigate what this implies for human appreciation of uncertainty. In our task, a Bayesian learner distinguishes three equally salient levels of uncertainty. First, the Bayesian perceives irreducible uncertainty or risk: even knowing the payoff probabilities of a given arm, the outcome remains uncertain. Second, there is (parameter) estimation uncertainty or ambiguity: payoff probabilities are unknown and need to be estimated. Third, the outcome probabilities of the arms change: the sudden jumps are referred to as unexpected uncertainty. We document how the three levels of uncertainty evolved during the course of our experiment and how they affected the learning rate. We then zoom in on estimation uncertainty, which has been suggested to be a driving force in exploration, in spite of evidence of widespread aversion to ambiguity. Our data corroborate the latter. We discuss neural evidence that foreshadowed the ability of humans to distinguish between the three levels of uncertainty. Finally, we investigate the boundaries of human capacity to implement Bayesian learning. We repeat the experiment with different instructions, reflecting varying levels of structural uncertainty. Under this fourth notion of uncertainty, choices were no better explained by Bayesian updating than by (model-free) reinforcement learning. Exit questionnaires revealed that participants remained unaware of the presence of unexpected uncertainty and failed to acquire the right model with which to implement Bayesian updating.
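    The three uncertainty levels can be made concrete in a toy Beta-Bernoulli learner for one bandit arm (a sketch, not the paper's model; the forgetting factor is illustrative). Exponential forgetting keeps estimation uncertainty from collapsing, which is what lets a Bayesian learner keep tracking arms whose payoff probabilities jump.

```python
def bayes_update(alpha, beta, reward, decay=0.98):
    """Beta-Bernoulli update with exponential forgetting. With decay=1 this
    is the exact conjugate update; decay<1 discounts old evidence so the
    learner can follow sudden jumps (unexpected uncertainty)."""
    return decay * alpha + reward, decay * beta + (1 - reward)

def stats(alpha, beta):
    mean = alpha / (alpha + beta)
    est_var = alpha * beta / ((alpha + beta) ** 2 * (alpha + beta + 1))  # estimation uncertainty
    risk = mean * (1 - mean)   # irreducible outcome uncertainty given the mean estimate
    return mean, est_var, risk

# An arm that pays off reliably, then jumps to paying nothing.
a, b = 1.0, 1.0
for _ in range(50):
    a, b = bayes_update(a, b, 1)
m_high, _, _ = stats(a, b)
for _ in range(50):
    a, b = bayes_update(a, b, 0)
m_low, _, _ = stats(a, b)
# m_high is near 1 before the jump; m_low recovers toward 0 after it.
```

Without forgetting, the posterior's pseudo-counts grow without bound and the effective learning rate decays to zero, so the jump would never be tracked.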

    A compact statistical model of the song syntax in Bengalese finch

    Songs of many songbird species consist of variable sequences of a finite number of syllables. A common approach for characterizing the syntax of these complex syllable sequences is to use transition probabilities between the syllables. This is equivalent to the Markov model, in which each syllable is associated with one state, and the transition probabilities between the states do not depend on the state transition history. Here we analyze the song syntax in a Bengalese finch. We show that the Markov model fails to capture the statistical properties of the syllable sequences. Instead, a state transition model that accurately describes the statistics of the syllable sequences includes adaptation of the self-transition probabilities when states are repeatedly revisited, and allows more than one state to be associated with the same syllable. Such a model does not increase the model complexity significantly. Mathematically, the model is a partially observable Markov model with adaptation (POMMA). The success of the POMMA supports the branching chain network hypothesis of how syntax is controlled within the premotor song nucleus HVC, and suggests that adaptation and many-to-one mapping from neural substrates to syllables are important features of the neural control of complex song syntax.
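    The baseline the paper argues against — the first-order Markov model, where each syllable is one state and transitions are estimated directly from bigram counts — fits in a few lines (toy syllable strings, not the paper's data):

```python
from collections import defaultdict

def transition_probs(sequences):
    """Maximum-likelihood first-order Markov model: transition
    probabilities between syllables estimated from observed bigrams."""
    counts = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):   # consecutive syllable pairs
            counts[a][b] += 1
    return {a: {b: c / sum(nxt.values()) for b, c in nxt.items()}
            for a, nxt in counts.items()}

songs = ["aabbc", "aabbbc", "abbc"]      # toy syllable sequences
P = transition_probs(songs)              # e.g. P['a']['b'] = 3/5
```

Because this model keeps a single state per syllable with history-independent transitions, repeat lengths are forced to be geometrically distributed; the POMMA's adapting self-transitions and many-to-one state-to-syllable mapping are precisely what relax that constraint.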